BLS Employment Revisions Analysis

Examining the Accuracy and Political Neutrality of U.S. Jobs Data

Author

Mirae Han

Introduction

Understanding Employment Data Revisions

The Bureau of Labor Statistics (BLS) releases monthly employment reports that significantly impact economic policy, financial markets, and political discourse. However, these initial estimates are later revised as more complete data becomes available. Understanding the nature, magnitude, and patterns of these revisions is crucial for:

  • Economic Policy: Central banks and policymakers rely on accurate employment data
  • Market Transparency: Investors need to understand data reliability
  • Political Accountability: Examining claims of data manipulation
  • Statistical Methodology: Assessing the quality of BLS estimation procedures

Research Questions

This analysis examines BLS Current Employment Statistics (CES) revisions from 1979 to 2025 to answer:

  1. Historical Patterns: What are the largest revisions in CES history?
  2. Directional Bias: Are revisions systematically positive or negative?
  3. Temporal Trends: Has revision accuracy improved over time?
  4. Seasonal Effects: Do certain months show larger revisions?
  5. Political Neutrality: Do revisions differ by presidential administration?
  6. Statistical Inference: What can computational methods reveal about partisan claims?

Data Sources

CES Total Nonfarm Payroll (Series ID: CES0000000001)
- Source: Bureau of Labor Statistics Data Portal
- Coverage: January 1979 - June 2025
- Variables: Employment level (thousands of jobs)

CES Revision Data
- Source: BLS Employment Situation Historical Revisions
- Contains: Original estimates, final estimates, and calculated revisions
- Updates: Monthly with annual comprehensive revisions

Data Acquisition

Task 1: CES Total Nonfarm Payroll Data

Show code
#' Download CES Total Nonfarm Employment Data from BLS
get_ces_employment <- function() {
  
  ces_employment <- request("https://data.bls.gov/pdq/SurveyOutputServlet") |>
    req_user_agent("Mozilla/5.0") |>
    req_body_form(
      series_id = "CES0000000001",
      years_option = "specific_years", 
      from_year = "1979",
      to_year = "2025",
      periods_option = "all_periods",
      output_type = "column",
      output_view = "data"
    ) |>
    req_perform() |>
    resp_body_html() |>
    html_table() |>
    pluck(2) |>
    mutate(
      month = str_remove(Period, "M"),
      date = ym(paste(Year, month)),
      level = as.numeric(Value)
    ) |>
    select(date, level) |>
    drop_na() |>
    filter(date >= ym("1979-01"), date <= ym("2025-06")) |>
    arrange(date)
  
  return(ces_employment)
}

# Load employment data
ces_employment <- get_ces_employment()

Successfully downloaded 558 months of employment data spanning January 1979 to June 2025.

Key Observations:

  • The dataset contains the total nonfarm payroll employment level for each month, measured in thousands of jobs
  • This represents the “headline” employment figure that is widely reported in the media each month
  • These are the original estimates released by BLS, which will later be compared to revised figures
  • The employment level has generally trended upward over this 45+ year period, reflecting U.S. labor force growth

Task 2: CES Revision Data

Show code
#' Download BLS CES Revision Data from Historical Tables
get_ces_revisions <- function() {
  
  # Fetch HTML page with revision tables
  resp <- request("https://www.bls.gov/web/empsit/cesnaicsrev.htm") |>
    req_user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36") |>
    req_headers(Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8") |>
    req_perform()
  
  html_content <- resp |> resp_body_html()
  
  # Function to extract data for a specific year
  extract_year <- function(year) {
    xpath <- paste0("//table[@id='", year, "']")
    
    html_content |>
      html_element(xpath = xpath) |>
      html_table(header = FALSE) |>
      slice(4:15) |>
      select(month = X1, year_col = X2, original = X3, final = X5) |>
      mutate(
        month = str_extract(month, "^[A-Z][a-z]{2}"),
        date = ym(paste(year, month)),
        original = as.numeric(original),
        final = as.numeric(final),
        revision = final - original
      ) |>
      select(date, original, final, revision) |>
      drop_na()
  }
  
  # Extract all years and combine
  ces_revisions <- map_dfr(1979:2025, extract_year) |>
    arrange(date)
  
  return(ces_revisions)
}

# Load revision data
ces_revisions <- get_ces_revisions()

Extracted 559 monthly revisions from 1979-01-01 to 2025-07-01.

Key Observations:

  • Each row represents a monthly employment estimate that has been revised at least once
  • The original column shows BLS’s initial estimate released to the public
  • The final column shows the revised estimate after more complete data became available
  • The revision column is calculated as (final - original), so:
    • Positive revisions = BLS initially underestimated job growth
    • Negative revisions = BLS initially overestimated job growth
  • Revisions can occur months or even years after the original release as survey response rates improve

Revision Data Sample

Show code
kable(head(ces_revisions, 10),
      caption = "First 10 months of CES revision data",
      col.names = c("Date", "Original Estimate", "Final Estimate", "Revision"),
      digits = 1)
First 10 months of CES revision data
Date Original Estimate Final Estimate Revision
1979-01-01 325 243 -82
1979-02-01 301 294 -7
1979-03-01 324 445 121
1979-04-01 72 -15 -87
1979-05-01 171 291 120
1979-06-01 97 225 128
1979-07-01 44 87 43
1979-08-01 2 49 47
1979-09-01 135 41 -94
1979-10-01 306 179 -127

Data Exploration

Task 3: Combined Dataset Creation

Show code
# Merge employment levels with revisions
ces_data <- ces_employment |>
  left_join(ces_revisions, by = "date") |>
  mutate(
    year = year(date),
    month = month(date, label = TRUE),
    decade = floor(year / 10) * 10,
    abs_revision = abs(revision),
    revision_pct_of_final = (revision / final) * 100,
    revision_pct_of_level = (revision / level) * 100,
    is_positive_revision = revision > 0
  ) |>
  drop_na()

Combined dataset contains 558 observations with complete employment and revision information.

What This Dataset Tells Us:

  • By joining employment levels with revisions, we can analyze revision patterns in context
  • We’ve added several calculated variables to facilitate analysis:
    • abs_revision: Absolute value of revision (magnitude regardless of direction)
    • revision_pct_of_level: Revision as a percentage of total employment (shows relative impact)
    • is_positive_revision: Boolean indicator for directional analysis
    • decade: Groups data by 10-year periods for temporal trends
  • Any months without complete revision data are excluded (using drop_na())
  • This clean dataset is now ready for statistical analysis and visualization

Summary Statistics

Statistic 1: Largest Revisions in History

Show code
# Largest positive revision
stat1_positive <- ces_data |>
  arrange(desc(revision)) |>
  slice(1) |>
  select(date, revision, level, final)

# Largest negative revision
stat1_negative <- ces_data |>
  arrange(revision) |>
  slice(1) |>
  select(date, revision, level, final)

Largest POSITIVE Revision:

Show code
kable(stat1_positive,
      col.names = c("Date", "Revision (000s)", "Employment Level", "Final Estimate"),
      caption = "Month with largest upward revision",
      digits = 1)
Month with largest upward revision
Date Revision (000s) Employment Level Final Estimate
2021-11-01 437 149206 647

Largest NEGATIVE Revision:

Show code
kable(stat1_negative,
      col.names = c("Date", "Revision (000s)", "Employment Level", "Final Estimate"),
      caption = "Month with largest downward revision",
      digits = 1)
Month with largest downward revision
Date Revision (000s) Employment Level Final Estimate
2020-03-01 -672 150895 -1373

Interpretation:

  • Both extreme revisions occurred during the COVID-19 pandemic: the largest downward revision in March 2020 and the largest upward revision in November 2021
  • Large revisions often coincide with recessions or rapid economic changes when initial estimates are most difficult
  • These outliers demonstrate that while BLS methodology is generally accurate, significant estimation errors can occur during turbulent times
  • The magnitude of these revisions relative to total employment level provides context for their economic significance

Statistic 2: Fraction of Positive Revisions

Show code
# By year
stat2_year <- ces_data |>
  filter(year <= 2024) |>
  group_by(year) |>
  summarise(
    total_months = n(),
    positive_revisions = sum(is_positive_revision),
    fraction_positive = positive_revisions / total_months
  )

# By decade
stat2_decade <- ces_data |>
  group_by(decade) |>
  summarise(
    total_months = n(),
    positive_revisions = sum(is_positive_revision),
    fraction_positive = positive_revisions / total_months
  )

By Decade:

Show code
kable(stat2_decade,
      col.names = c("Decade", "Total Months", "Positive Revisions", "Fraction Positive"),
      caption = "Proportion of positive revisions by decade",
      digits = 3)
Proportion of positive revisions by decade
Decade Total Months Positive Revisions Fraction Positive
1970 12 5 0.417
1980 120 59 0.492
1990 120 83 0.692
2000 120 65 0.542
2010 120 75 0.625
2020 66 31 0.470

Interpretation:

  • A fraction near 0.50 would indicate no systematic bias (equal positive and negative revisions)
  • Values consistently above 0.50 suggest BLS tends to underestimate job growth initially
  • Values consistently below 0.50 suggest BLS tends to overestimate job growth initially
  • Variation across decades may reflect changes in:
    • Survey methodology and response rates
    • Economic conditions (volatile vs. stable periods)
    • Labor market structure (e.g., rise of gig economy, remote work)
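The "fraction near 0.50" check described above can be run as a formal test. A minimal sketch using base R's exact binomial test, with the overall counts implied by the decade table (318 positive months out of 558):

```r
# Exact two-sided binomial test of whether the share of positive revisions
# differs from 0.5, using the totals from the decade table above
binom.test(x = 318, n = 558, p = 0.5)

# Equivalent, computed directly from the combined dataset built in Task 3:
# binom.test(sum(ces_data$is_positive_revision), nrow(ces_data), p = 0.5)
```

A p-value below 0.05 here would indicate the excess of positive revisions is unlikely under a no-bias null, though the test ignores autocorrelation between adjacent months.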

Statistic 3: Relative Revision Magnitude Over Time

Show code
stat3 <- ces_data |>
  mutate(relative_magnitude = abs(revision) / abs(final)) |>
  group_by(year) |>
  summarise(
    avg_relative_magnitude = mean(relative_magnitude, na.rm = TRUE),
    median_relative_magnitude = median(relative_magnitude, na.rm = TRUE)
  )
Show code
kable(tail(stat3, 10),
      col.names = c("Year", "Avg Relative Magnitude", "Median Relative Magnitude"),
      caption = "Recent years: Average |revision|/final ratio",
      digits = 4)
Recent years: Average |revision|/final ratio
Year Avg Relative Magnitude Median Relative Magnitude
2016 0.1448 0.1062
2017 0.3524 0.1057
2018 0.1586 0.1312
2019 0.2282 0.2148
2020 0.1263 0.0711
2021 0.3435 0.2367
2022 0.0779 0.0702
2023 0.2667 0.2070
2024 0.3570 0.2338
2025 3.4021 0.6902

Interpretation:

  • This metric normalizes revision size by the magnitude of the employment change itself
  • Lower values = more accurate initial estimates relative to the actual change
  • Higher values = initial estimates were further off from the final revised change
  • This measure is particularly useful because it accounts for:
    • The fact that larger employment changes are naturally harder to estimate
    • Differences in volatility across time periods
  • Trends over time can reveal whether BLS estimation accuracy is improving or declining

Statistic 4: Absolute Revision as % of Employment Level

Show code
stat4 <- ces_data |>
  mutate(abs_revision_pct = (abs(revision) / level) * 100) |>
  group_by(year) |>
  summarise(
    avg_abs_revision_pct = mean(abs_revision_pct, na.rm = TRUE),
    median_abs_revision_pct = median(abs_revision_pct, na.rm = TRUE)
  )
Show code
kable(tail(stat4, 10),
      col.names = c("Year", "Avg Abs Revision %", "Median Abs Revision %"),
      caption = "Recent years: Revision size as % of total employment",
      digits = 4)
Recent years: Revision size as % of total employment
Year Avg Abs Revision % Median Abs Revision %
2016 0.0128 0.0111
2017 0.0209 0.0132
2018 0.0227 0.0162
2019 0.0226 0.0248
2020 0.0916 0.0249
2021 0.1226 0.1052
2022 0.0182 0.0163
2023 0.0324 0.0302
2024 0.0307 0.0301
2025 0.0510 0.0493

Interpretation:

  • This shows revision magnitude as a percentage of the total employment level (not just the monthly change)
  • Typical values of 0.01-0.05% indicate revisions are very small relative to total employment
  • This metric helps contextualize whether revisions are economically significant:
    • 0.01% revision ≈ 15,000 jobs when total employment is 150 million
    • Even “large” revisions in absolute terms are often tiny as a share of total employment
  • Very low percentages demonstrate that despite month-to-month volatility, the employment level is measured quite accurately

Statistic 5: Seasonal Patterns in Revisions

Show code
stat5 <- ces_data |>
  group_by(month) |>
  summarise(
    avg_revision = mean(revision),
    avg_abs_revision = mean(abs_revision),
    median_revision = median(revision),
    n_months = n()
  ) |>
  arrange(desc(avg_abs_revision))
Show code
kable(stat5,
      col.names = c("Month", "Avg Revision", "Avg |Revision|", "Median Revision", "N"),
      caption = "Average revision by month (sorted by absolute magnitude)",
      digits = 2)
Average revision by month (sorted by absolute magnitude)
Month Avg Revision Avg |Revision| Median Revision N
Sep 53.28 80.15 51.0 46
Apr 22.57 68.91 16.0 47
Mar -3.77 65.55 11.0 47
May 17.53 55.53 17.0 47
Nov 29.07 55.07 24.0 46
Dec -6.93 54.33 -4.0 46
Aug 32.72 53.85 19.5 46
Jun 7.26 53.47 2.0 47
Jul 2.35 53.39 9.5 46
Oct -17.67 50.67 -11.0 46
Jan 0.06 48.23 7.0 47
Feb 2.02 43.72 -3.0 47

Interpretation:

  • Seasonal patterns in revisions may indicate months where employment is particularly difficult to estimate
  • The table shows September with by far the largest average revisions (both signed and absolute), followed by the spring months of March-May
  • Potential explanations for this variation:
    • September: Back-to-school transitions in education employment and the unwinding of summer jobs complicate seasonal adjustment
    • March-May: The pace of spring hiring varies considerably from year to year
    • January and February show the smallest absolute revisions, suggesting post-holiday employment is comparatively stable and easier to estimate
  • Months with larger absolute revisions may benefit from:
    • Enhanced survey methodology
    • Larger initial sample sizes
    • More careful seasonal adjustment procedures
  • The consistency of patterns across years suggests systematic rather than random challenges
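Whether these month-to-month differences are statistically distinguishable can be checked without a normality assumption. A minimal sketch using base R's Kruskal-Wallis test, assuming the `ces_data` frame from Task 3 (where `month` is a factor):

```r
# Kruskal-Wallis rank-sum test of whether revision distributions differ
# across calendar months; robust to the heavy tails seen during recessions
kruskal.test(revision ~ month, data = ces_data)
```

A small p-value would support systematic seasonal differences; a large one would suggest the sorted table ordering could plausibly arise from noise.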

Statistic 6: Overall Revision Statistics

Show code
stat6 <- ces_data |>
  summarise(
    avg_revision = mean(revision),
    avg_abs_revision = mean(abs_revision),
    median_revision = median(revision),
    median_abs_revision = median(abs_revision),
    avg_revision_pct = mean(revision_pct_of_level),
    avg_abs_revision_pct = mean(abs(revision_pct_of_level)),
    median_revision_pct = median(revision_pct_of_level),
    median_abs_revision_pct = median(abs(revision_pct_of_level))
  )
Show code
kable(t(stat6),
      col.names = c("Value"),
      caption = "Overall CES revision statistics (1979-2025)",
      digits = 3)
Overall CES revision statistics (1979-2025)
Value
avg_revision 11.498
avg_abs_revision 56.896
median_revision 10.000
median_abs_revision 42.000
avg_revision_pct 0.010
avg_abs_revision_pct 0.048
median_revision_pct 0.007
median_abs_revision_pct 0.033

Overall Assessment:

These aggregate statistics provide a comprehensive picture of BLS revision patterns:

  • Mean revision: Indicates whether there’s a systematic directional bias (under/overestimation)
  • Median revision: More robust to outliers; shows typical revision direction
  • Mean absolute revision: Average magnitude regardless of direction; measures overall accuracy
  • Percentage metrics: Put revisions in context relative to employment levels

Key Takeaways:

  1. If mean ≈ median ≈ 0: No systematic bias, revisions are symmetric
  2. If mean/median > 0: Tendency to initially underestimate job growth
  3. If mean/median < 0: Tendency to initially overestimate job growth
  4. Low percentage values indicate high accuracy despite large absolute numbers
  5. The difference between mean and median reveals influence of outliers (e.g., recession periods)
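Takeaways 1-3 can be checked directly with a one-sample test of whether the mean revision is zero. A minimal sketch, assuming the `ces_data` frame from Task 3:

```r
# One-sample t-test of whether the mean revision differs from zero,
# i.e. whether BLS initial estimates are directionally unbiased overall
t.test(ces_data$revision, mu = 0)
```

With a mean revision of about +11.5K, rejecting this null would indicate a systematic tendency to underestimate job growth initially, subject to the usual caveats about outliers and autocorrelation.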

Visualizations

Visualization 1: Revisions Over Time

Show code
ggplot(ces_data, aes(x = date, y = revision)) +
  geom_line(color = "steelblue", alpha = 0.6, linewidth = 0.8) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red", linewidth = 1) +
  geom_smooth(method = "loess", se = TRUE, color = "darkblue", 
              fill = "lightblue", alpha = 0.3) +
  annotate("rect", xmin = as.Date("2008-01-01"), xmax = as.Date("2009-12-31"),
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "orange") +
  annotate("text", x = as.Date("2008-07-01"), y = max(ces_data$revision, na.rm = TRUE) * 0.9,
           label = "Great Recession", size = 3.5, fontface = "italic") +
  annotate("rect", xmin = as.Date("2020-03-01"), xmax = as.Date("2020-06-01"),
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  annotate("text", x = as.Date("2020-04-15"), y = max(ces_data$revision, na.rm = TRUE) * 0.8,
           label = "COVID-19", size = 3.5, fontface = "italic", color = "darkred") +
  scale_y_continuous(labels = comma) +
  labs(
    title = "BLS Employment Revisions Over Time",
    subtitle = "Positive values = BLS initially underestimated job growth",
    x = "Date",
    y = "Revision (thousands of jobs)",
    caption = "Source: Bureau of Labor Statistics CES Program"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(size = 11, hjust = 0.5),
    plot.caption = element_text(size = 8, color = "grey50")
  )

CES employment revisions from 1979-2025 with economic recession shading

Visualization 2: Distribution of Revisions

Show code
ggplot(ces_data, aes(x = revision)) +
  geom_histogram(aes(y = after_stat(density)), 
                 bins = 60, fill = "steelblue", alpha = 0.7, color = "white") +
  geom_density(color = "darkblue", linewidth = 1.2) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "red", linewidth = 1) +
  geom_vline(aes(xintercept = mean(revision)), 
             color = "darkgreen", linetype = "dashed", linewidth = 1) +
  annotate("text", x = mean(ces_data$revision) + 50, 
           y = max(density(ces_data$revision)$y) * 0.8,
           label = paste0("Mean = ", round(mean(ces_data$revision), 1)), 
           color = "darkgreen", fontface = "bold") +
  scale_x_continuous(labels = comma) +
  labs(
    title = "Distribution of CES Employment Revisions",
    subtitle = "Density plot showing central tendency and spread",
    x = "Revision (thousands of jobs)",
    y = "Density",
    caption = "Note: Slight positive skew indicates tendency to underestimate initially"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(size = 11, hjust = 0.5)
  )

Histogram showing the distribution of monthly employment revisions

Visualization 3: Absolute Revision by Decade

Show code
ggplot(ces_data, aes(x = factor(decade), y = abs_revision, fill = factor(decade))) +
  geom_boxplot(alpha = 0.7, outlier.color = "red", outlier.alpha = 0.5) +
  geom_jitter(alpha = 0.1, width = 0.2, size = 0.5) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 3, 
               fill = "yellow", color = "black") +
  scale_y_continuous(labels = comma) +
  scale_fill_brewer(palette = "Set3") +
  labs(
    title = "Absolute Revision Magnitude by Decade",
    subtitle = "Yellow diamonds = mean | Box = IQR | Whiskers = 1.5×IQR",
    x = "Decade",
    y = "Absolute Revision (thousands of jobs)",
    caption = "Note: Recent decades show larger absolute revisions due to larger labor force"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(size = 11, hjust = 0.5),
    legend.position = "none"
  )

Boxplot comparing absolute revision magnitudes across decades

Visualization 4: Seasonal Heatmap

Show code
# Prepare data for heatmap
heatmap_data <- ces_data |>
  group_by(year, month) |>
  summarise(avg_revision = mean(revision, na.rm = TRUE), .groups = "drop") |>
  filter(year >= 2000)  # Focus on recent years for clarity

ggplot(heatmap_data, aes(x = month, y = factor(year), fill = avg_revision)) +
  geom_tile(color = "white", linewidth = 0.5) +
  scale_fill_gradient2(
    low = "red", mid = "white", high = "blue",
    midpoint = 0,
    labels = comma,
    name = "Avg Revision\n(000s jobs)"
  ) +
  labs(
    title = "Monthly Revision Patterns (2000-2025)",
    subtitle = "Blue = underestimated | Red = overestimated",
    x = "Month",
    y = "Year",
    caption = "Source: BLS CES revisions"
  ) +
  theme_minimal(base_size = 11) +
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(size = 11, hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid = element_blank()
  )

Heatmap showing average revisions by month and year

Political Analysis

Task 4: Presidential Administration Comparison

Show code
# Label each month with its presidential administration and party
ces_political <- ces_data |>
  mutate(
    president = case_when(
      date >= as.Date("2021-01-20") & date < as.Date("2025-01-20") ~ "Biden",
      date >= as.Date("2017-01-20") & date < as.Date("2021-01-20") ~ "Trump",
      date >= as.Date("2009-01-20") & date < as.Date("2017-01-20") ~ "Obama",
      date >= as.Date("2001-01-20") & date < as.Date("2009-01-20") ~ "Bush Jr.",
      date >= as.Date("1993-01-20") & date < as.Date("2001-01-20") ~ "Clinton",
      date >= as.Date("1989-01-20") & date < as.Date("1993-01-20") ~ "Bush Sr.",
      date >= as.Date("1981-01-20") & date < as.Date("1989-01-20") ~ "Reagan",
      date >= as.Date("1977-01-20") & date < as.Date("1981-01-20") ~ "Carter",
      TRUE ~ NA_character_
    ),
    party = case_when(
      president %in% c("Carter", "Clinton", "Obama", "Biden") ~ "D",
      president %in% c("Reagan", "Bush Sr.", "Bush Jr.", "Trump") ~ "R",
      TRUE ~ NA_character_
    )
  ) |>
  drop_na(president, party)

Summary Statistics by Party

Show code
party_summary <- ces_political |>
  group_by(party) |>
  summarise(
    n_months = n(),
    mean_revision = mean(revision),
    median_revision = median(revision),
    mean_abs_revision = mean(abs_revision),
    sd_revision = sd(revision),
    prop_positive = mean(is_positive_revision)
  )

kable(party_summary,
      col.names = c("Party", "N Months", "Mean Rev", "Median Rev", 
                    "Mean |Rev|", "SD Rev", "% Positive"),
      caption = "Revision statistics by presidential party",
      digits = 2)
Revision statistics by presidential party
Party N Months Mean Rev Median Rev Mean |Rev| SD Rev % Positive
D 265 21.42 18 53.27 73.73 0.64
R 288 4.15 2 59.64 90.02 0.52

Visualization 1: Party Comparison

Show code
# Calculate means for annotation
dem_avg <- party_summary |> filter(party == "D") |> pull(mean_revision)
rep_avg <- party_summary |> filter(party == "R") |> pull(mean_revision)

ces_political |>
  ggplot(aes(x = party, y = revision, fill = party)) +
  geom_boxplot(alpha = 0.7, outlier.alpha = 0.3, show.legend = FALSE) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 4,
               fill = "red", color = "darkred") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black", linewidth = 1) +
  scale_fill_manual(values = c("D" = "steelblue", "R" = "coral")) +
  annotate("text", x = 1, y = Inf,
           label = sprintf("D avg: %.1fK", dem_avg),
           vjust = 2, size = 4, fontface = "bold", color = "steelblue") +
  annotate("text", x = 2, y = Inf,
           label = sprintf("R avg: %.1fK", rep_avg),
           vjust = 2, size = 4, fontface = "bold", color = "coral") +
  scale_y_continuous(labels = comma) +
  labs(
    title = "CES Revisions: Democratic vs Republican Administrations",
    subtitle = sprintf("Difference = %.1fK | Red diamonds = means", abs(dem_avg - rep_avg)),
    x = "Administration Party",
    y = "Revision (thousands of jobs)",
    caption = "Source: BLS CES Program"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(size = 11, hjust = 0.5),
    plot.caption = element_text(size = 8, color = "grey50")
  )

Visualization 2: Individual Presidents

Show code
ces_political |>
  mutate(president = factor(president, 
                            levels = c("Carter", "Reagan", "Bush Sr.", "Clinton", 
                                      "Bush Jr.", "Obama", "Trump", "Biden"))) |>
  ggplot(aes(x = president, y = revision, fill = party)) +
  geom_boxplot(alpha = 0.7, outlier.alpha = 0.3) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 3, 
               fill = "yellow", color = "black") +
  scale_fill_manual(
    values = c("D" = "steelblue", "R" = "coral"),
    labels = c("D" = "Democratic", "R" = "Republican")
  ) +
  scale_y_continuous(labels = comma) +
  labs(
    title = "Employment Revisions by Presidential Administration",
    subtitle = "Yellow diamonds = mean revision for each administration",
    x = "President",
    y = "Revision (thousands of jobs)",
    fill = "Party",
    caption = "Source: BLS CES Program"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(size = 11, hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.position = "bottom"
  )

Task 5: Hypothesis Testing

T-Test: Mean Revision by Party

Show code
# Two-sample t-test (using all data, not filtering Carter)
ttest_result <- t.test(revision ~ party, data = ces_political)

# Summary statistics by party
test_summary <- ces_political |>
  group_by(party) |>
  summarise(
    n = n(),
    mean = mean(revision),
    median = median(revision),
    sd = sd(revision),
    se = sd / sqrt(n)
  )

Summary Statistics by Party:

Show code
kable(test_summary,
      col.names = c("Party", "N", "Mean", "Median", "SD", "SE"),
      caption = "Descriptive statistics for revisions by party",
      digits = 2)
Descriptive statistics for revisions by party
Party N Mean Median SD SE
D 265 21.42 18 73.73 4.53
R 288 4.15 2 90.02 5.30

Hypothesis Test Results:

Show code
cat("Two-Sample T-Test: Mean Revision by Party\n")
Two-Sample T-Test: Mean Revision by Party
Show code
cat("==========================================\n\n")
==========================================
Show code
cat(sprintf("Democratic mean: %.2f (n = %d)\n", 
            test_summary$mean[test_summary$party == "D"], 
            test_summary$n[test_summary$party == "D"]))
Democratic mean: 21.42 (n = 265)
Show code
cat(sprintf("Republican mean: %.2f (n = %d)\n", 
            test_summary$mean[test_summary$party == "R"], 
            test_summary$n[test_summary$party == "R"]))
Republican mean: 4.15 (n = 288)
Show code
cat(sprintf("\nDifference: %.2f\n", ttest_result$estimate[1] - ttest_result$estimate[2]))

Difference: 17.27
Show code
cat(sprintf("t-statistic: %.3f\n", ttest_result$statistic))
t-statistic: 2.476
Show code
cat(sprintf("p-value: %.4f\n", ttest_result$p.value))
p-value: 0.0136
Show code
cat(sprintf("95%% CI: [%.2f, %.2f]\n", 
            ttest_result$conf.int[1], ttest_result$conf.int[2]))
95% CI: [3.57, 30.97]
Show code
cat("\n")
Show code
if (ttest_result$p.value >= 0.05) {
  cat("✓ NOT significant (p >= 0.05)\n")
  cat("→ NO evidence of systematic partisan bias in BLS revisions\n")
} else {
  cat("⚠ Statistically significant (p < 0.05)\n")
  cat("→ However, effect size should be evaluated for practical significance\n")
}
⚠ Statistically significant (p < 0.05)
→ However, effect size should be evaluated for practical significance

Visualization: Revisions by Party

Show code
# Calculate means for annotation
dem_avg <- test_summary |> filter(party == "D") |> pull(mean)
rep_avg <- test_summary |> filter(party == "R") |> pull(mean)

# Create the plot
task5_plot <- ces_political |>
  ggplot(aes(x = party, y = revision, fill = party)) +
  geom_boxplot(alpha = 0.7, outlier.alpha = 0.3, show.legend = FALSE) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 4,
               fill = "red", color = "darkred") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black", linewidth = 1) +
  scale_fill_manual(values = c("D" = "steelblue", "R" = "coral")) +
  annotate("text", x = 1, y = Inf,
           label = sprintf("D avg: %.1fK", dem_avg),
           vjust = 2, size = 4, fontface = "bold", color = "steelblue") +
  annotate("text", x = 2, y = Inf,
           label = sprintf("R avg: %.1fK", rep_avg),
           vjust = 2, size = 4, fontface = "bold", color = "coral") +
  scale_y_continuous(labels = comma) +
  labs(
    title = "CES Revisions: Democratic vs Republican Administrations",
    subtitle = sprintf("Difference = %.1fK | Red diamonds = means",
                       abs(dem_avg - rep_avg)),
    x = "Administration Party",
    y = "Revision (thousands of jobs)",
    caption = "Distributions overlap heavily; the mean gap is small relative to within-party spread"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(size = 10, hjust = 0.5),
    plot.caption = element_text(size = 8, color = "grey50")
  )

# Print the plot
print(task5_plot)

Boxplot comparing employment revisions between Democratic and Republican administrations

Interpretation:

  • Red diamonds: Mean revisions for each party
  • Box: Interquartile range (middle 50% of data)
  • Heavily overlapping distributions: Within-party spread (SDs of roughly 74K and 90K) dwarfs the gap between parties
  • Different means: Democratic = 21.4K, Republican = 4.1K, a gap the t-test above finds statistically significant (p = 0.014) but modest in practical terms

Understanding the Results:

  • Null Hypothesis (H₀): Mean revision is the same for both parties
  • Alternative Hypothesis (H₁): Mean revision differs between parties
  • p-value ≥ 0.05: Cannot reject null hypothesis → No significant difference
  • p-value < 0.05: Reject null hypothesis → Significant difference detected

Limitations of This Test:

  • Assumes the sampling distribution of the mean is approximately normal (questionable given heavy-tailed revisions during recessions)
  • R's t.test() defaults to Welch's test, which relaxes the equal-variance assumption, but outliers still distort the means
  • Does not account for temporal autocorrelation in monthly revisions
  • Cannot establish causation even if a difference exists

This is why we follow up with computational methods (permutation tests) that make fewer assumptions!

Extra Credit: Computational Inference

Part 1: What is Computationally-Intensive Inference?

IMAGINE YOU'RE A DETECTIVE:
Traditional Statistics (What We Did Before):
  You find a fingerprint and compare it to a database using a formula.
  You trust the formula works because mathematicians proved it decades ago.
  This is like using a t-test—it relies on mathematical theory.
Computationally-Intensive Inference (Bootstrap & Permutation):
  Instead of trusting a formula, you SIMULATE what could have happened.
  • BOOTSTRAP: 'What if I sampled my data thousands of different ways?'
    → Resample your data WITH replacement, calculate statistic each time
    → See how much the answer varies → Build confidence intervals
  • PERMUTATION TEST: 'What if there's really no difference between groups?'
    → Shuffle party labels randomly, recalculate difference thousands of times
    → See if real difference is unusual compared to random shuffles
WHY USE COMPUTATION INSTEAD OF FORMULAS?
  ✓ Works with messy, real-world data (doesn't assume perfect bell curves)
  ✓ No need to memorize complex formulas—computer does the heavy lifting
  ✓ Easier to explain: 'We tried 10,000 random versions and yours is rare'
  ✓ More accurate when sample sizes are small or data is skewed
REAL-WORLD ANALOGY:
  Traditional: 'The recipe says bake at 350°F for 30 minutes.'
  Computational: 'I baked this cake 1,000 times at different temperatures
                  and timed each one—here's what actually works.'
FOR OUR BLS ANALYSIS:
  Instead of assuming revisions follow a textbook distribution,
  we'll SIMULATE what revisions would look like if:
    1. There's no party bias (permutation test)
    2. We sampled different months (bootstrap confidence intervals)
  This gives us EVIDENCE-BASED answers, not formula-based assumptions.

Part 2: How Computationally-Intensive Methods Work

METHOD 1: PERMUTATION TEST (Testing Party Differences)
┌─────────────────────────────────────────────────────────────────────┐
│ STEP 1: Calculate Real Difference                                   │
│   Democratic months: avg revision = +5.2K                          │
│   Republican months: avg revision = -1.8K                          │
│   Real Difference = 5.2 - (-1.8) = 7.0K                           │
├─────────────────────────────────────────────────────────────────────┤
│ STEP 2: Simulate 'No Difference' World                             │
│   If party doesn't matter, labels are arbitrary                    │
│   → Shuffle labels randomly: [D,R,D,R,D] becomes [R,D,D,D,R]      │
│   → Recalculate difference with shuffled labels                    │
│   → Repeat 10,000 times = 10,000 'fake' differences               │
├─────────────────────────────────────────────────────────────────────┤
│ STEP 3: Compare Real to Simulated                                  │
│   Real difference = 7.0K                                           │
│   95% of shuffled differences fall between [-15K, +15K]           │
│   → 7.0K is INSIDE this range                                      │
│   → NOT unusual → NO evidence of party bias                        │
└─────────────────────────────────────────────────────────────────────┘
Visualization of Permutation Test:
  Null Distribution (10,000 shuffles)     Real Difference
          ▂▄▆█▆▄▂                              ↓
     ─────███████─────────────────────────────●──────────
        -15K   0K   +15K                      7.0K
  If ● falls in tails (< 2.5% or > 97.5%) → significant difference
  If ● falls in middle → NO significant difference
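The three steps above can be sketched in a few lines of base R. The numbers here are made-up toy revisions, not the actual BLS data analyzed later.

```r
# Toy revisions (thousands of jobs) with alternating party labels
set.seed(42)
revisions <- c(5, -2, 8, 1, -4, 7, 3, -1, 6, 2)
party     <- rep(c("D", "R"), times = 5)

# STEP 1: the real difference in means
obs_diff <- mean(revisions[party == "D"]) - mean(revisions[party == "R"])

# STEP 2: shuffle labels to simulate the "no difference" world
null_diffs <- replicate(10000, {
  shuffled <- sample(party)   # random relabeling, revisions unchanged
  mean(revisions[shuffled == "D"]) - mean(revisions[shuffled == "R"])
})

# STEP 3: two-sided p-value = share of shuffles at least as extreme
p_value <- mean(abs(null_diffs) >= abs(obs_diff))
```

The `infer` pipeline used later (`specify |> hypothesize |> generate |> calculate`) automates exactly this shuffle-and-recompute loop.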
METHOD 2: BOOTSTRAP (Building Confidence Intervals)
┌─────────────────────────────────────────────────────────────────────┐
│ STEP 1: Resample Your Data                                         │
│   Original: [Rev₁, Rev₂, Rev₃, Rev₄, Rev₅, ..., Revₙ]            │
│   Bootstrap sample 1: [Rev₃, Rev₁, Rev₃, Rev₉, Rev₂, ...]        │
│   Bootstrap sample 2: [Rev₇, Rev₇, Rev₁, Rev₄, Rev₆, ...]        │
│   (Sampling WITH replacement—same revision can appear twice)       │
├─────────────────────────────────────────────────────────────────────┤
│ STEP 2: Calculate Statistic for Each Sample                        │
│   Sample 1 mean = 3.2K                                             │
│   Sample 2 mean = 4.1K                                             │
│   Sample 3 mean = 2.8K                                             │
│   ... (repeat 10,000 times)                                        │
├─────────────────────────────────────────────────────────────────────┤
│ STEP 3: Find Middle 95% of Bootstrap Distribution                  │
│   Sort 10,000 means: [0.5K, 1.2K, 2.1K, ..., 8.3K, 9.1K]         │
│   2.5th percentile = 1.8K                                          │
│   97.5th percentile = 6.2K                                         │
│   → 95% Confidence Interval: [1.8K, 6.2K]                         │
└─────────────────────────────────────────────────────────────────────┘
Visualization of Bootstrap:
  Bootstrap Distribution of Means
           ▂▄▆███▆▄▂
     ──────┤─────────┤─────────
          1.8K      6.2K
       (2.5%)  95%  (2.5%)
  'We're 95% confident the true mean is between 1.8K and 6.2K'
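The bootstrap steps above can likewise be sketched in base R, again with toy numbers rather than the BLS revisions.

```r
# Toy revisions (thousands of jobs), NOT the actual BLS data
set.seed(42)
revisions <- c(5, -2, 8, 1, -4, 7, 3, -1, 6, 2)

# STEPS 1-2: resample WITH replacement, record the mean each time
boot_means <- replicate(10000, mean(sample(revisions, replace = TRUE)))

# STEP 3: middle 95% of the bootstrap distribution
ci <- quantile(boot_means, probs = c(0.025, 0.975))
```

Sampling with replacement is the key move: each resample is a plausible alternative dataset drawn from the data's own empirical distribution.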

Part 3 & 4: Applying Computational Inference to BLS Data

Show code
cat(paste(rep("=", 80), collapse = ""), "\n")
================================================================================ 
Show code
cat("===== PART 3 & 4: Applying Computational Inference to BLS Data =====\n\n")
===== PART 3 & 4: Applying Computational Inference to BLS Data =====
Show code
cat("We'll perform THREE tests using bootstrap/permutation methods:\n")
We'll perform THREE tests using bootstrap/permutation methods:
Show code
cat("  TEST A: Mean revision by party (computational t-test analogue)\n")
  TEST A: Mean revision by party (computational t-test analogue)
Show code
cat("  TEST B: Median revision by party (computational Wilcoxon analogue)\n")
  TEST B: Median revision by party (computational Wilcoxon analogue)
Show code
cat("  TEST C: Probability of positive revision by party (computational binomial test)\n\n")
  TEST C: Probability of positive revision by party (computational binomial test)

Test A: Mean Revision - Permutation Test

Show code
cat(paste(rep("-", 80), collapse = ""), "\n")
-------------------------------------------------------------------------------- 
Show code
cat("TEST A: Mean Revision - Permutation Test (Computational t-test)\n")
TEST A: Mean Revision - Permutation Test (Computational t-test)
Show code
cat(paste(rep("-", 80), collapse = ""), "\n\n")
-------------------------------------------------------------------------------- 
Show code
cat("Question: Do Democratic and Republican administrations have different MEAN revisions?\n\n")
Question: Do Democratic and Republican administrations have different MEAN revisions?
Show code
set.seed(20241207)

# Calculate observed statistic
obs_stat_mean <- ces_political |>
  specify(revision ~ party) |>
  calculate(stat = "diff in means", order = c("D", "R"))

cat(sprintf("Observed difference (D - R): %.2f thousand jobs\n\n", obs_stat_mean$stat))
Observed difference (D - R): 17.27 thousand jobs
Show code
# Generate null distribution via permutation
null_dist_mean <- ces_political |>
  specify(revision ~ party) |>
  hypothesize(null = "independence") |>
  generate(reps = 10000, type = "permute") |>
  calculate(stat = "diff in means", order = c("D", "R"))

cat("Generating 10,000 permutations...\n")
Generating 10,000 permutations...
Show code
cat(sprintf("Null distribution: mean = %.2f, sd = %.2f\n\n", 
            mean(null_dist_mean$stat), sd(null_dist_mean$stat)))
Null distribution: mean = 0.01, sd = 7.00
Show code
# Calculate p-value
p_value_mean <- null_dist_mean |>
  get_p_value(obs_stat = obs_stat_mean, direction = "two-sided")

cat(sprintf("P-value (two-sided): %.4f\n\n", p_value_mean$p_value))
P-value (two-sided): 0.0132
Show code
# Visualize
viz_perm_mean <- null_dist_mean |>
  visualize() +
  shade_p_value(obs_stat = obs_stat_mean, direction = "two-sided") +
  labs(
    title = "Permutation Test: Mean Revision Difference by Party",
    subtitle = sprintf("Observed diff = %.2f | p-value = %.4f | Red = as/more extreme than observed",
                       obs_stat_mean$stat, p_value_mean$p_value),
    x = "Difference in Mean Revision (D - R) under Null",
    y = "Count (out of 10,000 permutations)",
    caption = "If party doesn't matter, observed difference should be TYPICAL, not extreme"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold", size = 14))

print(viz_perm_mean)

Show code
cat("\n----- INTERPRETATION -----\n")

----- INTERPRETATION -----
Show code
if (p_value_mean$p_value >= 0.05) {
  cat(sprintf("✓ p = %.4f (NOT significant)\n", p_value_mean$p_value))
  cat("  → Observed difference is TYPICAL if party doesn't matter\n")
  cat("  → NO evidence that party affects mean revision\n")
  cat("  → Consistent with 'NO POLITICAL BIAS' conclusion\n\n")
} else {
  cat(sprintf("⚠ p = %.4f (significant)\n", p_value_mean$p_value))
  cat("  → But check if difference is economically meaningful\n\n")
}
⚠ p = 0.0132 (significant)
  → But check if difference is economically meaningful

Test B: Median Revision - Permutation Test

Show code
cat(paste(rep("-", 80), collapse = ""), "\n")
-------------------------------------------------------------------------------- 
Show code
cat("TEST B: Median Revision - Permutation Test (Computational Wilcoxon)\n")
TEST B: Median Revision - Permutation Test (Computational Wilcoxon)
Show code
cat(paste(rep("-", 80), collapse = ""), "\n\n")
-------------------------------------------------------------------------------- 
Show code
cat("Question: Do Democratic and Republican administrations have different MEDIAN revisions?\n")
Question: Do Democratic and Republican administrations have different MEDIAN revisions?
Show code
cat("(Median is more robust to outliers than mean)\n\n")
(Median is more robust to outliers than mean)
Show code
# Calculate observed statistic
obs_stat_median <- ces_political |>
  specify(revision ~ party) |>
  calculate(stat = "diff in medians", order = c("D", "R"))

cat(sprintf("Observed difference (D - R): %.2f thousand jobs\n\n", obs_stat_median$stat))
Observed difference (D - R): 16.00 thousand jobs
Show code
# Generate null distribution via permutation
null_dist_median <- ces_political |>
  specify(revision ~ party) |>
  hypothesize(null = "independence") |>
  generate(reps = 10000, type = "permute") |>
  calculate(stat = "diff in medians", order = c("D", "R"))

cat("Generating 10,000 permutations...\n")
Generating 10,000 permutations...
Show code
cat(sprintf("Null distribution: mean = %.2f, sd = %.2f\n\n", 
            mean(null_dist_median$stat), sd(null_dist_median$stat)))
Null distribution: mean = 0.14, sd = 5.85
Show code
# Calculate p-value
p_value_median <- null_dist_median |>
  get_p_value(obs_stat = obs_stat_median, direction = "two-sided")

cat(sprintf("P-value (two-sided): %.4f\n\n", p_value_median$p_value))
P-value (two-sided): 0.0074
Show code
# Visualize
viz_perm_median <- null_dist_median |>
  visualize() +
  shade_p_value(obs_stat = obs_stat_median, direction = "two-sided") +
  labs(
    title = "Permutation Test: Median Revision Difference by Party",
    subtitle = sprintf("Observed diff = %.2f | p-value = %.4f | Tests CENTER of distribution",
                       obs_stat_median$stat, p_value_median$p_value),
    x = "Difference in Median Revision (D - R) under Null",
    y = "Count (out of 10,000 permutations)",
    caption = "Median = 50th percentile | More robust to COVID-19 outliers"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold", size = 14))

print(viz_perm_median)

Show code
cat("\n----- INTERPRETATION -----\n")

----- INTERPRETATION -----
Show code
if (p_value_median$p_value >= 0.05) {
  cat(sprintf("✓ p = %.4f (NOT significant)\n", p_value_median$p_value))
  cat("  → Median revisions are SIMILAR across parties\n")
  cat("  → Even accounting for outliers, NO party effect\n")
  cat("  → Further evidence AGAINST political bias\n\n")
} else {
  cat(sprintf("⚠ p = %.4f (significant)\n", p_value_median$p_value))
  cat("  → Medians differ, but check practical significance\n\n")
}
⚠ p = 0.0074 (significant)
  → Medians differ, but check practical significance

Test C: Positive Revision Rate - Permutation Test

Show code
cat(paste(rep("-", 80), collapse = ""), "\n")
-------------------------------------------------------------------------------- 
Show code
cat("TEST C: Positive Revision Rate - Permutation Test (Computational Binomial)\n")
TEST C: Positive Revision Rate - Permutation Test (Computational Binomial)
Show code
cat(paste(rep("-", 80), collapse = ""), "\n\n")
-------------------------------------------------------------------------------- 
Show code
cat("Question: Do Democratic and Republican administrations have different rates\n")
Question: Do Democratic and Republican administrations have different rates
Show code
cat("          of POSITIVE revisions (i.e., initial underestimation)?\n\n")
          of POSITIVE revisions (i.e., initial underestimation)?
Show code
# Calculate observed statistic
obs_stat_prop <- ces_political |>
  specify(is_positive_revision ~ party, success = "TRUE") |>
  calculate(stat = "diff in props", order = c("D", "R"))

cat(sprintf("Observed difference (D - R): %.4f\n", obs_stat_prop$stat))
Observed difference (D - R): 0.1204
Show code
cat("(Positive = D has higher rate of positive revisions)\n\n")
(Positive = D has higher rate of positive revisions)
Show code
# Calculate actual proportions for context
prop_summary <- ces_political |>
  group_by(party) |>
  summarise(
    n = n(),
    n_positive = sum(is_positive_revision),
    prop_positive = mean(is_positive_revision)
  )

print(prop_summary)
# A tibble: 2 × 4
  party     n n_positive prop_positive
  <chr> <int>      <int>         <dbl>
1 D       265        169         0.638
2 R       288        149         0.517
Show code
cat("\n")
Show code
# Generate null distribution via permutation
null_dist_prop <- ces_political |>
  specify(is_positive_revision ~ party, success = "TRUE") |>
  hypothesize(null = "independence") |>
  generate(reps = 10000, type = "permute") |>
  calculate(stat = "diff in props", order = c("D", "R"))

cat("Generating 10,000 permutations...\n")
Generating 10,000 permutations...
Show code
cat(sprintf("Null distribution: mean = %.4f, sd = %.4f\n\n", 
            mean(null_dist_prop$stat), sd(null_dist_prop$stat)))
Null distribution: mean = -0.0004, sd = 0.0429
Show code
# Calculate p-value
p_value_prop <- null_dist_prop |>
  get_p_value(obs_stat = obs_stat_prop, direction = "two-sided")

cat(sprintf("P-value (two-sided): %.4f\n\n", p_value_prop$p_value))
P-value (two-sided): 0.0058
Show code
# Visualize
viz_perm_prop <- null_dist_prop |>
  visualize() +
  shade_p_value(obs_stat = obs_stat_prop, direction = "two-sided") +
  labs(
    title = "Permutation Test: Positive Revision Rate Difference by Party",
    subtitle = sprintf("Observed diff = %.3f | p-value = %.4f | Tests if 'underestimation' is partisan",
                       obs_stat_prop$stat, p_value_prop$p_value),
    x = "Difference in Proportion Positive (D - R) under Null",
    y = "Count (out of 10,000 permutations)",
    caption = "Positive revision = BLS initially underestimated | If biased, one party would have more"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold", size = 14))

print(viz_perm_prop)

Show code
cat("\n----- INTERPRETATION -----\n")

----- INTERPRETATION -----
Show code
if (p_value_prop$p_value >= 0.05) {
  cat(sprintf("✓ p = %.4f (NOT significant)\n", p_value_prop$p_value))
  cat("  → Both parties have similar rates of positive revisions\n")
  cat("  → BLS doesn't systematically underestimate one party more\n")
  cat("  → NO evidence of 'cooking the books' for political gain\n\n")
} else {
  cat(sprintf("⚠ p = %.4f (significant)\n", p_value_prop$p_value))
  cat("  → Positive-revision rates differ between parties\n")
  cat(sprintf("  → Absolute difference: %.1f percentage points (63.8%% vs 51.7%%)\n", 
              abs(obs_stat_prop$stat) * 100))
  cat("  → Statistically significant; consider whether business-cycle timing explains it\n\n")
}
⚠ p = 0.0058 (significant)
  → Positive-revision rates differ between parties
  → Absolute difference: 12.0 percentage points (63.8% vs 51.7%)
  → Statistically significant; consider whether business-cycle timing explains it

BONUS: Bootstrap Confidence Intervals

Show code
cat(paste(rep("=", 80), collapse = ""), "\n")
================================================================================ 
Show code
cat("BONUS: Bootstrap Confidence Intervals for Mean Revision by Party\n")
BONUS: Bootstrap Confidence Intervals for Mean Revision by Party
Show code
cat(paste(rep("=", 80), collapse = ""), "\n\n")
================================================================================ 
Show code
cat("Using bootstrap to build 95% confidence intervals for mean revision\n")
Using bootstrap to build 95% confidence intervals for mean revision
Show code
cat("(Shows RANGE of plausible values, accounting for sampling variability)\n\n")
(Shows RANGE of plausible values, accounting for sampling variability)
Show code
set.seed(20241207)

# Bootstrap for Democrats
boot_dem <- ces_political |>
  filter(party == "D") |>
  specify(response = revision) |>
  generate(reps = 10000, type = "bootstrap") |>
  calculate(stat = "mean")

ci_dem <- boot_dem |>
  get_confidence_interval(level = 0.95, type = "percentile")

cat("Democratic Administrations:\n")
Democratic Administrations:
Show code
cat(sprintf("  95%% CI: [%.2f, %.2f] thousand jobs\n", ci_dem$lower_ci, ci_dem$upper_ci))
  95% CI: [12.63, 30.47] thousand jobs
Show code
# Bootstrap for Republicans
boot_rep <- ces_political |>
  filter(party == "R") |>
  specify(response = revision) |>
  generate(reps = 10000, type = "bootstrap") |>
  calculate(stat = "mean")

ci_rep <- boot_rep |>
  get_confidence_interval(level = 0.95, type = "percentile")

cat("Republican Administrations:\n")
Republican Administrations:
Show code
cat(sprintf("  95%% CI: [%.2f, %.2f] thousand jobs\n\n", ci_rep$lower_ci, ci_rep$upper_ci))
  95% CI: [-6.40, 14.69] thousand jobs
Show code
# Check for overlap (note: CI overlap is a conservative check; two CIs can
# overlap even when the difference between groups is significant)
if (ci_dem$lower_ci <= ci_rep$upper_ci && ci_rep$lower_ci <= ci_dem$upper_ci) {
  cat("✓ CONFIDENCE INTERVALS OVERLAP\n")
  cat("  → Consistent with a modest difference between parties\n")
  cat("  → Overlap does NOT contradict the significant permutation tests;\n")
  cat("    a CI for the D - R difference itself would be more direct\n")
} else {
  cat("⚠ Confidence intervals do NOT overlap\n")
  cat("  → Suggests possible difference, but check effect size\n")
}
✓ CONFIDENCE INTERVALS OVERLAP
  → Consistent with a modest difference between parties
  → Overlap does NOT contradict the significant permutation tests;
    a CI for the D - R difference itself would be more direct
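Comparing two one-sample CIs for overlap is conservative: the intervals can overlap even when the difference is significant. A sketch of the more direct check, bootstrapping the D - R difference itself, using toy normal draws sized like the two samples (265 D months, 288 R months) rather than the actual `ces_political` data:

```r
# Toy stand-ins for the two groups (hypothetical means/sd, NOT real data)
set.seed(20241207)
dem_rev <- rnorm(265, mean = 21, sd = 80)
rep_rev <- rnorm(288, mean = 4,  sd = 80)

# Resample each group independently and bootstrap the difference in means
boot_diff <- replicate(10000, {
  mean(sample(dem_rev, replace = TRUE)) - mean(sample(rep_rev, replace = TRUE))
})
ci_diff <- quantile(boot_diff, c(0.025, 0.975))
# If this interval excludes 0, the data favor a real D - R difference
# even when the two one-sample CIs overlap
```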
Show code
# Visualize bootstrap distributions
viz_boot_combined <- bind_rows(
  boot_dem |> mutate(party = "Democratic"),
  boot_rep |> mutate(party = "Republican")
) |>
  ggplot(aes(x = stat, fill = party)) +
  geom_histogram(alpha = 0.6, bins = 50, position = "identity") +
  geom_vline(data = data.frame(party = c("Democratic", "Republican"),
                               ci_lower = c(ci_dem$lower_ci, ci_rep$lower_ci),
                               ci_upper = c(ci_dem$upper_ci, ci_rep$upper_ci)) |>
               pivot_longer(cols = c(ci_lower, ci_upper), 
                            names_to = "bound", values_to = "value"),
             aes(xintercept = value, color = party),
             linetype = "dashed", linewidth = 1) +
  scale_fill_manual(values = c("Democratic" = "steelblue", 
                               "Republican" = "coral")) +
  scale_color_manual(values = c("Democratic" = "darkblue", 
                                "Republican" = "darkred")) +
  labs(
    title = "Bootstrap Distributions: Mean Revision by Party",
    subtitle = "10,000 bootstrap resamples | Dashed lines = 95% confidence intervals",
    x = "Mean Revision (thousands of jobs)",
    y = "Frequency",
    fill = "Administration",
    color = "Administration",
    caption = "Distributions overlap only partially: D mean revisions sit higher than R"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    plot.subtitle = element_text(size = 10),
    legend.position = "bottom"
  )

print(viz_boot_combined)

Conclusions

Parametric and computational methods agree: the permutation tests detect a statistically significant tendency for revisions to be more positive during Democratic administrations (about 17 thousand jobs in means, and 12 percentage points in positive-revision rates), while the bootstrap confidence intervals overlap, suggesting the effect is modest. A difference of this size is small relative to the volatility of monthly revisions and is plausibly explained by how recessions and recoveries fell across administrations rather than by manipulation. The data show a detectable partisan-era difference, but no evidence that BLS revisions are politically engineered.